Data minimization for GDPR compliance in machine learning models

Abstract

The EU General Data Protection Regulation (GDPR) and the California Privacy Rights Act (CPRA) mandate the principle of data minimization, which requires that only the data necessary to fulfill a certain purpose be collected. However, it can often be difficult to determine the minimal amount of data required, especially in complex machine learning models such as deep neural networks. We present a first-of-a-kind method to reduce the amount of personal data needed to perform predictions with a machine learning model, by removing or generalizing some of the input features of the runtime data. Our method makes use of the knowledge encoded within the model to produce a generalization that has little to no impact on its accuracy, based on knowledge distillation approaches. We show that, in some cases, less data may be collected while preserving the exact same level of model accuracy as before, and that if a small deviation in accuracy is allowed, even more generalizations of the input features may be performed. We also demonstrate that when collecting the features dynamically, the generalizations may be further improved. This method enables organizations to truly minimize the amount of data collected, thus fulfilling the data minimization requirement set out in the regulations.
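
The idea of generalizing an input feature while checking the effect on predictions can be illustrated with a minimal sketch. This is not the authors' distillation-based method: it is a plain search over bin widths, and `toy_model`, the feature names, the synthetic records, and the candidate widths are all hypothetical choices made for the illustration.

```python
import random

def toy_model(age, income):
    """Hypothetical stand-in for a trained classifier (not from the paper)."""
    return 1 if age >= 40 and income >= 50_000 else 0

def generalize(value, width):
    """Replace a value with the midpoint of the width-sized bin that contains it."""
    return (value // width) * width + width / 2

def agreement(records, age_width):
    """Fraction of records whose prediction is unchanged when age is generalized."""
    same = sum(
        toy_model(generalize(age, age_width), income) == toy_model(age, income)
        for age, income in records
    )
    return same / len(records)

def coarsest_width(records, candidate_widths, min_agreement):
    """Return the largest candidate width whose agreement meets the threshold."""
    best = 1  # width 1 keeps every integer age distinguishable
    for width in sorted(candidate_widths):
        if agreement(records, width) >= min_agreement:
            best = width
    return best

# Synthetic records (age, income), seeded so the run is repeatable.
rng = random.Random(0)
records = [(rng.randint(18, 79), rng.randint(20_000, 120_000)) for _ in range(1000)]

widths = (1, 5, 10, 15, 20, 40)
for width in widths:
    print(width, round(agreement(records, width), 3))
print("chosen width:", coarsest_width(records, widths, min_agreement=1.0))
```

In this toy setting, a bin width that lines up with the model's decision boundary leaves every prediction unchanged, so only the bin, not the exact age, would need to be collected. A width that straddles the boundary changes some predictions, which corresponds to the accuracy deviation the abstract mentions. Lowering `min_agreement` below 1.0 trades a small deviation for coarser generalization.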

Similar resources

Machine Learning Models for Housing Prices Forecasting using Registration Data

This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...

Static Analysis for GDPR Compliance

Information systems might access, manage and record sensitive data about citizens. In addition, the pervasiveness of these systems is dramatically increasing thanks to the mobile and IoT revolutions. However, several unintended data breaches are reported every week, and this might compromise the privacy, safety, and security of citizens. For all these reasons, the European Pa...

Modelling Provenance for GDPR Compliance using Linked Open Data Vocabularies

The upcoming General Data Protection Regulation (GDPR) requires justification of data activities to acquire, use, share, and store data using consent obtained from the user. Failure to comply may result in heavy fines, which incentivises the creation and maintenance of records for all activities involving consent and data. Compliance documentation therefore requires provenance informatio...

Some HCI Priorities for GDPR-Compliant Machine Learning

Presented at The General Data Protection Regulation: An Opportunity for the CHI Community? (CHI-GDPR 2018), Workshop at ACM CHI’18, 22 April 2018, Montréal, Canada.

Abstract: In this short paper, we consider the roles of HCI in enabling the better governance of consequential machine learning systems using the rights and obligations laid out in the recent 2016 EU General Data Protection Regulation (GDPR), a law...

A New Approach to Credibility Premium for Zero-Inflated Poisson Models for Panel Data

The main goal of this research is to obtain and compare credibility premiums in count models with under-reporting for longitudinal data. Predictive premiums are computed under the squared-error and exponential loss functions and then compared. The desire to earn bonuses and rewards is one of the main reasons accidents go unreported, and policyholders often refrain from reporting low-cost accidents in order to keep their discounts. In this research ...

Journal

Journal title: AI and Ethics

Year: 2021

ISSN: 2730-5953, 2730-5961

DOI: https://doi.org/10.1007/s43681-021-00095-8